Audio-visual anticipatory coarticulation modeling by human and machine

Authors

  • Louis H. Terry
  • Karen Livescu
  • Janet B. Pierrehumbert
  • Aggelos K. Katsaggelos
Abstract

The phenomenon of anticipatory coarticulation provides a basis for the observed asynchrony between the acoustic and visual onsets of phones in certain linguistic contexts. This type of asynchrony is typically not explicitly modeled in audio-visual speech models. In this work, we study within-word audio-visual asynchrony using manual labels of words in which theory suggests that audio-visual asynchrony should occur, and show that these hand labels confirm the theory. We then introduce a new statistical model of audio-visual speech, the asynchrony-dependent transition (ADT) model. This model allows asynchrony between audio and video states within word boundaries, where the audio and video state transitions depend not only on the state of that modality, but also on the instantaneous asynchrony. The ADT model outperforms a baseline synchronous model in mimicking the hand labels in a forced alignment task, and its behavior as parameters are changed conforms to our expectations about anticipatory coarticulation. The same model could be used for speech recognition, although here we consider it only for the task of forced alignment for linguistic analysis.
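
The idea of transitions that depend on both a stream's own state and the instantaneous asynchrony can be illustrated with a toy two-stream sketch. This is not the authors' parameterization: the function names (`advance_prob`, `joint_transitions`), the linear `pull` term, the hard limit `max_async`, and all numeric values are hypothetical choices made only for illustration.

```python
from itertools import product


def advance_prob(state, lead, base_by_state, pull=0.2, max_async=2):
    """Probability that one stream moves to its next state in this frame.

    state: current state index of this stream (left-to-right chain).
    lead: how many states this stream is ahead of the other (negative if behind).
    base_by_state, pull, max_async: illustrative assumptions, not fitted values.
    """
    if state == len(base_by_state) - 1:
        return 0.0  # already in the final state
    if lead >= max_async:
        return 0.0  # simplification: a stream at the limit waits for the other
    p = base_by_state[state] - pull * lead  # a lagging stream is pushed forward
    return min(max(p, 0.0), 1.0)


def joint_transitions(audio, video, audio_base, video_base, **kw):
    """Distribution over next (audio, video) state pairs."""
    d = video - audio  # instantaneous asynchrony; positive means video leads
    p_a = advance_prob(audio, -d, audio_base, **kw)
    p_v = advance_prob(video, d, video_base, **kw)
    out = {}
    for step_a, step_v in product((0, 1), repeat=2):
        p = (p_a if step_a else 1 - p_a) * (p_v if step_v else 1 - p_v)
        if p > 0:
            out[(audio + step_a, video + step_v)] = p
    return out


base = [0.3, 0.3, 0.3, 0.3]
print(joint_transitions(1, 1, base, base))  # synchronous: all four moves possible
print(joint_transitions(0, 2, base, base))  # video at the limit: only audio may advance
```

In this sketch a synchronous baseline corresponds to `max_async=0` with both streams forced to move together, whereas a positive `max_async` lets the video state run ahead of the audio state, as anticipatory coarticulation would suggest.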

Related resources

The effect of cognitive load on tonal coarticulation

Cognitive load (CL) has been found to influence language perception in many interesting ways, but its role in production has not been explored. In this paper, we look at how CL influences production of tonal coarticulation in Mandarin Chinese. Since coarticulation has been found to involve cognitive planning, this is an especially appropriate domain for investigating the influence of CL. Result...

INTERSPEECH 2006: Using Dominance Functions and Audio-Visual Speech Synthesis

This paper presents results of training coarticulation models for Czech audio-visual speech synthesis. Two approaches to handling coarticulation in audio-visual speech synthesis were used: coarticulation based on dominance functions, and visual unit selection. Coarticulation models were trained for both approaches. Models for the unit selection approach were trained by visually clustered data...

Immediate effects of anticipatory coarticulation in spoken-word recognition.

Two visual-world experiments examined listeners' use of pre word-onset anticipatory coarticulation in spoken-word recognition. Experiment 1 established the shortest lag with which information in the speech signal influences eye-movement control, using stimuli such as "The … ladder is the target". With a neutral token of the definite article preceding the target word, saccades to the referent we...

On the causes of compensation for coarticulation: evidence for phonological mediation.

This study examined whether compensation for coarticulation in fricative-vowel syllables is phonologically mediated or a consequence of auditory processes. Smits (2001a) had shown that compensation occurs for anticipatory lip rounding in a fricative caused by a following rounded vowel in Dutch. In a first experiment, the possibility that compensation is due to general auditory processing was in...

Coarticulation Is Largely Planned

Is coarticulation a reflection of planning an utterance or the automatic effect of producing speech? This question was examined by requiring speakers to begin reading nonsense aCV or abVCa strings (where C was /b/ or /p/ and V was /i/ or /u/) aloud before seeing the entire utterance. Those segments known before articulation began exerted normal anticipatory coarticulatory influences, while thos...

Publication date: 2010